Input Sparsity Time Low-rank Approximation via Ridge Leverage Score Sampling

Authors

  • Michael B. Cohen
  • Cameron Musco
  • Christopher Musco
Abstract

Often used as importance sampling probabilities, leverage scores have become indispensable in randomized algorithms for linear algebra, optimization, graph theory, and machine learning. A major body of work seeks to adapt these scores to low-rank approximation problems. However, existing “low-rank leverage scores” can be difficult to compute, often work for just a single application, and are sensitive to matrix perturbations. We show how to avoid these issues by exploiting connections between low-rank approximation and regularization. Specifically, we employ ridge leverage scores, which are simply standard leverage scores computed with respect to an ℓ2-regularized input. Importance sampling by these scores gives the first unified solution to two of the most important low-rank sampling problems: (1 + ε) error column subset selection and (1 + ε) error projection-cost preservation. Moreover, ridge leverage scores satisfy a key monotonicity property that does not hold for any prior low-rank leverage scores. Their resulting robustness leads to two sought-after results in randomized linear algebra: (1) we give the first input-sparsity time low-rank approximation algorithm based on iterative column sampling, resolving an open question posed in [LMP13], [CLM15], and [AM15]; (2) we give the first single-pass streaming column subset selection algorithm whose real-number space complexity has no dependence on stream length. Part of this work was completed while the author interned at Yahoo Labs, NYC.
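To make the central definition concrete: the rank-k ridge leverage score of column a_i of A is a standard leverage score computed after ℓ2 regularization, τ_i(A) = a_iᵀ(AAᵀ + λI)⁺a_i with λ = ‖A − A_k‖²_F / k. The sketch below computes these scores directly via an SVD and draws a rescaled column subset by importance sampling; it is illustrative only (this direct computation is exactly the cost the paper's iterative algorithm avoids), and the function names and rescaling convention are our own.

```python
import numpy as np

def ridge_leverage_scores(A, k):
    """Rank-k ridge leverage scores of A's columns, computed directly.

    Uses lambda = ||A - A_k||_F^2 / k. Naive and illustrative only:
    the paper's algorithm approximates these scores in input-sparsity
    time instead of via a full SVD.
    """
    _, s, _ = np.linalg.svd(A, full_matrices=False)
    lam = (s[k:] ** 2).sum() / k                     # ||A - A_k||_F^2 / k
    M = np.linalg.pinv(A @ A.T + lam * np.eye(A.shape[0]))
    return np.einsum('ij,ij->j', A, M @ A)           # tau_i = a_i^T M a_i

def sample_columns(A, k, s_cols, seed=0):
    """Draw s_cols columns i.i.d. with probability proportional to the
    ridge leverage scores, rescaled as in standard importance sampling."""
    rng = np.random.default_rng(seed)
    tau = ridge_leverage_scores(A, k)
    p = tau / tau.sum()
    idx = rng.choice(A.shape[1], size=s_cols, replace=True, p=p)
    return A[:, idx] / np.sqrt(s_cols * p[idx])      # rescaled column subset
```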

Similar Papers

Ridge Regression and Provable Deterministic Ridge Leverage Score Sampling

Ridge leverage scores provide a balance between low-rank approximation and regularization, and are ubiquitous in randomized linear algebra and machine learning. Deterministic algorithms are also of interest in the moderately big data regime, because they have no failure probability and always return the same results, which gives the practitioner interpretability. We pr...

Tighter Low-rank Approximation via Sampling the Leveraged Element

In this work, we propose a new randomized algorithm for computing a low-rank approximation to a given matrix. Taking an approach different from existing literature, our method first involves a specific biased sampling, with an element being chosen based on the leverage scores of its row and column, and then involves weighted alternating minimization over the factored form of the intended low-ra...
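One plausible reading of that sampling step is sketched below: each entry (i, j) is drawn with probability proportional to the sum of its row and column rank-k leverage scores. This is an assumption for illustration; the cited paper's exact distribution and its weighted alternating-minimization phase are not reproduced here.

```python
import numpy as np

def leveraged_element_sample(A, k, m, seed=0):
    """Sample m entries of A biased toward high-leverage rows/columns.

    Illustrative sketch only: it computes exact leverage scores via an
    SVD (which a fast algorithm would avoid) and assumes the simple
    combination p_ij proportional to row_i + col_j.
    """
    rng = np.random.default_rng(seed)
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    row = (U[:, :k] ** 2).sum(axis=1)         # rank-k row leverage scores
    col = (Vt[:k, :] ** 2).sum(axis=0)        # rank-k column leverage scores
    p = (row[:, None] + col[None, :]).ravel() # assumed combination
    flat = rng.choice(A.size, size=m, replace=True, p=p / p.sum())
    return np.unravel_index(flat, A.shape)    # arrays of sampled (i, j)
```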

Provably Useful Kernel Matrix Approximation in Linear Time

We give the first algorithm for kernel Nyström approximation that runs in linear time in the number of training points and is provably accurate for all kernel matrices, without dependence on regularity or incoherence conditions. The algorithm projects the kernel onto a set of s landmark points sampled by their ridge leverage scores, requiring just O(ns) kernel evaluations and O(ns) additional r...
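The factored approximation this describes is the classical Nyström method; a minimal sketch is below, taking the landmark set as given (the paper's contribution is selecting it by recursive ridge leverage score sampling) and assuming a hypothetical RBF kernel helper.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Assumed helper: Gaussian (RBF) kernel matrix between rows of X and Y."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def nystrom(X, landmarks, kernel=rbf_kernel):
    """Nystrom factors for K ~ C @ pinv(W) @ C.T from given landmark rows.

    Only the n x s block C and the s x s block W are formed, i.e. O(ns)
    kernel evaluations; choosing `landmarks` by ridge leverage scores,
    the paper's actual contribution, is not shown here.
    """
    C = kernel(X, X[landmarks])       # n x s cross-kernel block
    W = C[landmarks]                  # s x s landmark-landmark block
    return C, np.linalg.pinv(W)       # keep factored; never form full K
```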


Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

Low-rank approximation is a common tool used to accelerate kernel methods: the n × n kernel matrix K is approximated via a rank-k matrix K̃ which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, c...


Publication date: 2017